A Toolchain for Dynamic Function Off-load on CPU-FPGA Platforms

نویسندگان

  • Takaaki Miyajima
  • David B. Thomas
  • Hideharu Amano
چکیده

This new toolchain for accelerating application on CPU-FPGA platforms, called Courier-FPGA, extracts runtime information from a running target binary, and re-constructs the function call graph including input-output data. Then, it synthesizes hardware modules on the FPGA and makes software functions on CPU by using Pipeline Generator. The Pipeline Generator also builds a pipeline control program by using Intel Threading Building Block (Intel TBB) to run both hardware modules and software functions in parallel. Finally, Courier-FPGA’s Function Off-loader dynamically replaces and off-loads the original functions in the binary by using the built pipeline. Courier-FPGA performs the off-loading without user intervention, source code tweaks or re-compilations of the binary. In our case studies, Courier-FPGA was used to accelerate a histogram-of-gradients (HOG) feature detection program on the Zynq platform. A series of functions were off-loaded, and the program was sped up 3.98 times by using the built pipeline.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Automatic Mixed Software Hardware Pipeline Builder for CPU-FPGA Platforms

Our toolchain for accelerating application called Courier-FPGA, is designed for utilize the processing power of CPU-FPGA platforms for software programmers and non-expert users. It automatically gathers runtime information of library functions from a running target binary, and constructs the function call graph including input-output data. Then, it uses corresponding predefined hardware modules...

متن کامل

Parallelizing Workload Execution in Embedded and High-Performance Heterogeneous Systems

In this paper, we introduce a so‰ware-de€ned framework that enables the parallel utilization of all the programmable processing resources available in heterogeneous system-on-chip (SoC) including FPGA-based hardware accelerators and programmable CPUs. Two platforms with di‚erent architectures are considered, and a single C/C++ source code is used in both of them for the CPU and FPGA resources. ...

متن کامل

ThreadPoolComposer - An Open-Source FPGA Toolchain for Software Developers

This extended abstract presents ThreadPoolComposer, a high-level synthesis-based development framework and metatoolchain that provides a uniform programming interface for FPGAs portable across multiple platforms.

متن کامل

3D Recursive Gaussian IIR on GPUs and FPGAs A Case Study for Accelerating Bandwidth-Bounded Applications

GPU devices typically have a higher off-chip bandwidth than FPGA-based systems. Thus typically GPU should perform better for bandwidth-bounded massive parallel applications. In this paper we present our implementations of a 3D recursive Gaussian IIR on multicore CPU, many-core GPU and multi-FPGA platforms. Our baseline implementation on the CPU features the smallest arithmetic computation (2 MA...

متن کامل

Workload distribution and balancing in FPGAs and CPUs with OpenCL and TBB

In this paper we evaluate the performance and energy effectiveness of FPGA and CPU devices for a kind of parallel computing applications in which the workload can be distributed in a way that enables simultaneous computing in addition to simple off loading. The FPGA device is programmed via OpenCL using the recent availability of commercial tools and hardware while Threading Building Blocks (TB...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • JIP

دوره 23  شماره 

صفحات  -

تاریخ انتشار 2015